The Benefit of Multitask Representation Learning
We discuss a general method to learn data representations from multiple
tasks. We provide a justification for this method in both settings of multitask
learning and learning-to-learn. The method is illustrated in detail in the
special case of linear feature learning. Conditions on the theoretical
advantage offered by multitask representation learning over independent task
learning are established. In particular, focusing on the important example of
half-space learning, we derive the regime in which multitask representation
learning is beneficial over independent task learning, as a function of the
sample size, the number of tasks and the intrinsic data dimensionality. Other
potential applications of our results include multitask feature learning in
reproducing kernel Hilbert spaces and multilayer, deep networks.
Comment: To appear in Journal of Machine Learning Research (JMLR). 31 pages
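As a rough illustration of the linear feature learning setting described in the abstract above, the sketch below compares independent per-task least squares with a two-stage estimate that first recovers a shared low-dimensional subspace from all tasks. It is not the procedure analysed in the paper: the regression setup (instead of half-space learning), the SVD-based subspace estimate, the function name `simulate`, and every parameter value are assumptions made for illustration.

```python
import numpy as np


def simulate(d=30, k=2, tasks=50, n=40, noise=0.5, seed=0):
    """Return (independent_error, multitask_error), averaged over tasks.

    Assumed toy model: every task's weight vector lies in one shared
    k-dimensional subspace of R^d, and each task observes n noisy samples.
    """
    rng = np.random.default_rng(seed)

    # Shared linear representation: orthonormal basis of a k-dim subspace.
    basis, _ = np.linalg.qr(rng.standard_normal((d, k)))
    true_w = basis @ rng.standard_normal((k, tasks))        # shape (d, tasks)

    xs = rng.standard_normal((tasks, n, d))
    ys = np.einsum("tnd,dt->tn", xs, true_w)
    ys = ys + noise * rng.standard_normal((tasks, n))

    # Independent task learning: ordinary least squares, one task at a time.
    w_ind = np.stack(
        [np.linalg.lstsq(xs[t], ys[t], rcond=None)[0] for t in range(tasks)],
        axis=1,
    )

    # Multitask step 1: estimate the shared subspace from all task estimates.
    u, _, _ = np.linalg.svd(w_ind, full_matrices=False)
    b_hat = u[:, :k]

    # Multitask step 2: refit each task inside the estimated subspace.
    w_mt = np.stack(
        [
            b_hat @ np.linalg.lstsq(xs[t] @ b_hat, ys[t], rcond=None)[0]
            for t in range(tasks)
        ],
        axis=1,
    )

    def mean_sq_error(w):
        return float(np.mean(np.sum((w - true_w) ** 2, axis=0)))

    return mean_sq_error(w_ind), mean_sq_error(w_mt)


if __name__ == "__main__":
    ind_err, mt_err = simulate()
    print(f"independent: {ind_err:.3f}  multitask: {mt_err:.3f}")
```

In this toy setup the per-task sample size `n` is only slightly larger than the ambient dimension `d`, while the shared subspace dimension `k` is small, which is the kind of regime in which pooling tasks can be expected to help. Varying `n`, `tasks`, and `d` gives a feel for how the comparison depends on the sample size, the number of tasks, and the dimensionality.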
Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network
With more and more household objects built on planned obsolescence and
consumed by a fast-growing population, hazardous waste recycling has become a
critical challenge. Given the large variability of household waste, current
recycling platforms mostly rely on human operators to analyze the scene,
typically composed of many object instances piled up in bulk. Helping them by
robotizing the unitary extraction is a key challenge to speed up this tedious
process. Whereas supervised deep learning has proven very efficient for such
object-level scene understanding, e.g., generic object detection and
segmentation in everyday scenes, it requires large sets of per-pixel
labeled images, which are rarely available in many application contexts,
including industrial robotics. We thus propose a step towards a practical
interactive application for generating an object-oriented robotic grasp,
requiring as inputs only one depth map of the scene and one user click on the
next object to extract. More precisely, we address in this paper the intermediate
problem of object segmentation in top views of piles of bulk objects given a
pixel location, called the seed, provided interactively by a human operator. We
propose a twofold framework for generating edge-driven instance segments.
First, we repurpose a state-of-the-art fully convolutional object contour
detector for seed-based instance segmentation by introducing the notion of
edge-mask duality with a novel patch-free and contour-oriented loss function.
Second, we train one model using only synthetic scenes, instead of manually
labeled training data. Our experimental results show that considering edge-mask
duality for training an encoder-decoder network, as we suggest, outperforms a
state-of-the-art patch-based network in the present application context.
Comment: This is a pre-print of an article published in Human Friendly
Robotics, 10th International Workshop, Springer Proceedings in Advanced
Robotics, vol 7 (Siciliano Bruno, Khatib Oussama; in press). The final
authenticated version is available online at:
https://doi.org/10.1007/978-3-319-89327-3_16
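One way to picture the link between contours and instance masks mentioned in the abstract above: once a closed contour surrounds the seed, the mask is the region reachable from the seed without crossing a contour pixel. The sketch below shows only that geometric idea with a plain flood fill on a binary edge map. It is not the network or loss function proposed in the paper, and the function name `mask_from_edges` and the input layout are assumptions made for illustration.

```python
from collections import deque


def mask_from_edges(edges, seed):
    """Flood-fill the region containing `seed`, stopping at contour pixels.

    edges: list of rows, each a list of 0/1 values (1 marks a contour pixel).
    seed:  (row, col) pixel location, e.g. a user click.
    Returns a mask of the same size with 1 for pixels in the seed's region.
    """
    rows, cols = len(edges), len(edges[0])
    mask = [[0] * cols for _ in range(rows)]
    r0, c0 = seed
    if edges[r0][c0]:
        # The seed lies on a contour pixel: no region to fill.
        return mask

    mask[r0][c0] = 1
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        # 4-connected neighbours, so diagonal gaps in a contour do not leak.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not edges[nr][nc] and not mask[nr][nc]:
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask


if __name__ == "__main__":
    edges = [
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    for row in mask_from_edges(edges, (2, 2)):
        print(row)
```

A flood fill like this breaks down as soon as the contour has a gap, which is one reason a learned, seed-conditioned model is attractive for cluttered depth maps.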
Conditional Random Fields as Recurrent Neural Networks
Pixel-level labelling tasks, such as semantic segmentation, play a central
role in image understanding. Recent approaches have attempted to harness the
capabilities of deep learning techniques for image recognition to tackle
pixel-level labelling tasks. One central issue in this methodology is the
limited capacity of deep learning techniques to delineate visual objects. To
solve this problem, we introduce a new form of convolutional neural network
that combines the strengths of Convolutional Neural Networks (CNNs) and
Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To
this end, we formulate mean-field approximate inference for the Conditional
Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks.
This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a
deep network that has desirable properties of both CNNs and CRFs. Importantly,
our system fully integrates CRF modelling with CNNs, making it possible to
train the whole deep network end-to-end with the usual back-propagation
algorithm, avoiding offline post-processing methods for object delineation. We
apply the proposed method to the problem of semantic image segmentation,
obtaining top results on the challenging Pascal VOC 2012 segmentation
benchmark.
Comment: This paper is published in IEEE ICCV 201
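The idea of unrolling mean-field inference into repeated, identical steps, as described in the abstract above, can be sketched on a toy problem. The code below runs a fixed number of mean-field updates for a small fully connected CRF with one Gaussian kernel over pixel positions and a fixed Potts compatibility. It is a minimal sketch under those assumptions, not the CRF-RNN layer itself: there are no learned parameters and no appearance (bilateral) kernel, and the names `mean_field`, `bandwidth`, and `weight` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def mean_field(unary, positions, bandwidth=1.0, weight=1.0, steps=5):
    """Approximate CRF marginals with a fixed number of mean-field updates.

    unary:     (n, labels) scores, higher means more likely (e.g. CNN logits).
    positions: (n, dims) pixel coordinates used by the Gaussian kernel.
    Each loop iteration applies the same update, like one recurrent step.
    """
    labels = unary.shape[1]

    # Gaussian pairwise kernel over positions, without self-connections.
    diff = positions[:, None, :] - positions[None, :, :]
    kernel = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)

    # Potts compatibility: penalise neighbours that take a different label.
    compat = 1.0 - np.eye(labels)

    q = softmax(unary)                      # initialisation
    for _ in range(steps):
        messages = kernel @ q               # message passing
        penalty = weight * messages @ compat  # weighting + compatibility
        q = softmax(unary - penalty)        # add unary term, normalise
    return q


if __name__ == "__main__":
    positions = np.arange(6, dtype=float)[:, None]
    # Pixel 1 weakly prefers label 1 although its neighbours prefer label 0.
    unary = np.array(
        [[2.0, 0.0], [0.0, 0.3], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0], [0.0, 2.0]]
    )
    print("unary only:", unary.argmax(axis=1))
    print("mean field:", mean_field(unary, positions).argmax(axis=1))
```

Because every step is built from matrix products and a softmax, the whole loop is differentiable, which is what makes it possible to place such iterations inside a network and train them with back-propagation.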
Making CNNs for Video Parsing Accessible
The ability to extract sequences of game events for high-resolution e-sport
games has traditionally required access to the game's engine. This serves as a
barrier to groups who don't possess this access. It is possible to apply deep
learning to derive these logs from gameplay video, but it requires
computational power that serves as an additional barrier. Groups such as small
e-sport tournament organizers would benefit from access to these logs, which
would let them better visualize gameplay to inform both audience and commentators.
In this paper we present a combined solution to reduce the required
computational resources and time to apply a convolutional neural network (CNN)
to extract events from e-sport gameplay videos. This solution consists of
techniques to train a CNN faster and methods to execute predictions more
quickly. This expands the types of machines capable of training and running
these models, which in turn extends access to extracting game logs with this
approach. We evaluate the approaches in the domain of DOTA2, one of the most
popular e-sports. Our results demonstrate our approach outperforms standard
backpropagation baselines.
Comment: 11 pages, 6 figures, Foundations of Digital Games 201
Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns
We introduce Deep Thermal Imaging, a new approach for close-range automatic
recognition of materials to enhance how people and ubiquitous technologies
understand their proximal environment. Our approach uses a low-cost mobile
thermal camera integrated into a smartphone to capture thermal textures. A deep
neural network classifies these textures into material types. This approach
works effectively without the need for ambient light sources or direct contact
with materials. Furthermore, the use of a deep learning network removes the
need to handcraft the set of features for different materials. We evaluated the
performance of the system by training it to recognise 32 material types in both
indoor and outdoor environments. Our approach produced recognition accuracies
above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584
images of 17 outdoor materials. We conclude by discussing its potential for
real-time use in HCI applications and future directions.
Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing
Systems